Lifetime changes in body fatness and breast density in postmenopausal women: the FEDRA study

Background High mammographic breast density (MBD) is an established risk factor for breast cancer (BC). Body fatness conveys an increased BC risk in postmenopause but is associated with less dense breasts. Here, we studied the relationship between body fatness and breast composition within the FEDRA (Florence-EPIC Digital mammographic density and breast cancer Risk Assessment) longitudinal study. Methods Repeated anthropometric data and MBD parameters (obtained through automated software on BC screening digital mammograms) were available for all participants, as well as information on other BC risk factors. Multivariate linear regression and functional data analysis were used to longitudinally evaluate the association of body fatness, and changes thereof over time, with dense (DV) and non-dense (NDV) breast volumes and volumetric percent density (VPD). Results A total of 5,262 women were included, with anthropometric data available at 20 and 40 years of age, at EPIC baseline (mean 49.0 years), and an average of 9.4 years thereafter. The mean number of mammograms per woman was 3.3 (SD 1.6). Body fatness (and increases thereof) at any age was positively associated with DV and NDV (the association being consistently stronger for the latter), and inversely associated with VPD. For instance, an increase by 1 kg/year between the age of 40 years and EPIC baseline was significantly associated with 1.97% higher DV, 8.85% higher NDV, and 5.82% lower VPD. Conclusion Body fatness and its increase from young adulthood until midlife are inversely associated with volumetric percent density, but positively associated with dense and non-dense breast volumes in postmenopausal women. Supplementary Information The online version contains supplementary material available at 10.1186/s13058-023-01624-5.

In recent decades, increasing computing power and data storage capacity have made it possible to approach longitudinal studies through functional data analysis, which represents observed data as functions of one or more variables of interest that vary over a given domain. In this work we applied the function-on-scalar (FoS) model to study the variability of mammographic breast density over time. The FEDRA study includes 5,262 women with a mean of 3.28 (SD 1.63) consecutive digital mammographic examinations per subject. We studied the effect of covariates of interest, with a main focus on body mass index (BMI) measured at ages 20 and 40, on a time-varying function of mammographic density. The FoS model below and the notation that follows are discussed in depth by Ramsay J.O. and Silverman B.W. [1]: $y_i(t) = x_i^{\top} \beta(t) + \epsilon_i(t)$, where $y_i(t)$ is the dependent variable, whose expected value is predicted through the linear predictor $X\beta$ ($x_i$ being the row of $X$ that holds the scalar covariates of the i-th individual). The main difference between the classical linear approach and functional data analysis is that the observed data points, whether they describe the dependent or the independent variables, are assumed to arise from a function that varies over a given domain, here time (the patients' age). The functional terms are the dependent variable (written as a function of time for the i-th individual) and the β coefficients, which describe the effect of the covariates over time. The error term $\epsilon_i$ is assumed to arise from a stochastic process with expected value 0 [2].
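As a rough illustration of the function-on-scalar idea, the sketch below estimates an intercept function and one coefficient function β(t) by ordinary least squares at each point of a time grid. It is a simplification under stated assumptions: a single scalar covariate, curves already evaluated on a common grid, and no basis expansion or smoothing. It is written in Python rather than with the R fda package used in the study, and the function name `pointwise_fos` and the toy data are hypothetical.

```python
def pointwise_fos(x, curves):
    """Pointwise least-squares fit of y_i(t_j) = b0(t_j) + b1(t_j) * x_i.

    x      : one scalar covariate value per subject (e.g. BMI at age 20)
    curves : curves[i][j] is the value of subject i's curve at grid point t_j
    Returns the estimated intercept and slope at every grid point.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    intercept, slope = [], []
    for j in range(len(curves[0])):
        y = [curve[j] for curve in curves]
        y_mean = sum(y) / n
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        b1 = sxy / sxx              # effect of the covariate at time t_j
        slope.append(b1)
        intercept.append(y_mean - b1 * x_mean)
    return intercept, slope


# Toy data (made up): the effect of x grows in magnitude with age.
grid = [63, 64, 65]
x = [20.0, 22.0, 25.0, 30.0]
curves = [[2 + t - 0.05 * t * xi for t in grid] for xi in x]
b0, b1 = pointwise_fos(x, curves)
```

Because the toy curves are exactly linear in x at each age, the estimated slope at age t recovers -0.05 t up to rounding error. A full functional fit would additionally represent β(t) through a basis expansion.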
These functions are obtained through a basis expansion: functional data are described by a linear combination of what are usually called functional building blocks. More generally, a function x of t is specified by K basis functions $\phi_k$ and the corresponding coefficients $c_k$: $x(t) = \sum_{k=1}^{K} c_k \phi_k(t)$. There are many possible choices for the basis system. In this analysis we chose B-splines (splines built from a set of basis functions that are themselves splines), a flexible tool for non-periodic functions, as the functions analysed here are assumed to be. A more in-depth description can be found in James et al. [3].
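The basis expansion can be made concrete with a small sketch that evaluates B-spline basis functions through the Cox-de Boor recursion and then forms the linear combination of basis functions and coefficients. The knot placement (a clamped knot vector with interior knots at each integer age between 63 and 68) and the function names are assumptions chosen for illustration, not the settings used in the study.

```python
def bspline_basis(k, order, knots, t):
    """Value at t of the k-th B-spline of the given order (Cox-de Boor)."""
    if order == 1:
        # Piecewise-constant base case on [knots[k], knots[k+1]).
        if knots[k] <= t < knots[k + 1]:
            return 1.0
        # Close the last non-empty interval on the right end of the domain.
        if t == knots[-1] and knots[k] < knots[k + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[k + order - 1] - knots[k]
    if d1 > 0:  # skip terms with zero-width support (repeated knots)
        left = (t - knots[k]) / d1 * bspline_basis(k, order - 1, knots, t)
    d2 = knots[k + order] - knots[k + 1]
    if d2 > 0:
        right = (knots[k + order] - t) / d2 * bspline_basis(k + 1, order - 1, knots, t)
    return left + right


def evaluate(coefs, order, knots, t):
    """x(t) = sum over k of c_k * phi_k(t)."""
    return sum(c * bspline_basis(k, order, knots, t) for k, c in enumerate(coefs))


# Order 4 (cubic) basis on the ages 63 to 68; 12 knots give K = 12 - 4 = 8 functions.
knots = [63.0] * 4 + [64.0, 65.0, 66.0, 67.0] + [68.0] * 4
n_basis = len(knots) - 4
```

A useful check on such a basis is that the basis functions sum to 1 at every point of the domain, so a vector of equal coefficients yields a constant function.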
The time-dependent analysis requires a highly informative dataset with a good representation of the time series. To obtain sufficiently informative functions while retaining enough individuals in the analyses, we considered 1,765 subjects with 4 consecutive mammographic examinations and no missing data for the following study covariates: BMI at age 20, BMI at age 40, menopausal status, age at menarche and birth index.
The analysis was carried out in R using the fda package [4]. The 4 consecutive mammographic examinations were unevenly spaced, and the observation windows varied across individuals, so that t0 (age at first mammographic examination) for subject i could be greater than tn (age at last mammographic examination) for subject j. The fda package could not be used directly under these conditions. To overcome this issue, for each subject we used an order-4 B-spline basis to extrapolate, on a given time sequence shared by all time series, the unobserved values. Moreover, because a time window common to every time series was needed, the age interval 63-68 was chosen as the trade-off between the width of the window (that is, the information carried by each time series) and the number of subjects (that is, of time series) in the study. A total of 390 women met these requirements (Supplementary figure 1). For the 390 selected women, cubic splines defined on the age interval 63-68 were used to specify the functions required by the fda package for the following mammographic density parameters (dependent variables), all log-transformed: volumetric percent density (VPD); absolute dense volume (DV, cm³); absolute non-dense volume (NDV, cm³). Cubic splines for the VPD are reported in Supplementary figure 2.
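The step of bringing unevenly spaced series onto a shared age window can be sketched as follows. Two assumptions are made here and should be read as such: a subject is kept only if her observed ages span the whole 63-68 window, and piecewise-linear interpolation is used as a simple stand-in for the order-4 B-spline fit described above. All names and the toy values are hypothetical.

```python
def covers_window(ages, start=63.0, end=68.0):
    """True if the observed ages span the whole common window (assumed rule)."""
    return min(ages) <= start and max(ages) >= end


def interpolate(ages, values, t):
    """Piecewise-linear interpolation, a stand-in for the spline fit."""
    pairs = sorted(zip(ages, values))
    for (a0, v0), (a1, v1) in zip(pairs, pairs[1:]):
        if a0 <= t <= a1:
            return v0 + (v1 - v0) * (t - a0) / (a1 - a0)
    raise ValueError("t lies outside the observed age range")


def common_grid_curves(subjects, start=63.0, end=68.0, n_points=6):
    """Evaluate every eligible subject on the same equally spaced age grid.

    subjects maps a subject id to (ages, values), e.g. ages at the 4
    examinations and the log-transformed density measured at each of them.
    """
    step = (end - start) / (n_points - 1)
    grid = [start + i * step for i in range(n_points)]
    curves = {}
    for sid, (ages, values) in subjects.items():
        if covers_window(ages, start, end):
            curves[sid] = [interpolate(ages, values, t) for t in grid]
    return grid, curves


# Toy input (made up): subject "B" starts after age 63 and is dropped.
subjects = {
    "A": ([62.0, 64.0, 66.0, 69.0], [1.0, 2.0, 3.0, 4.5]),
    "B": ([64.0, 66.0, 68.0, 70.0], [1.2, 1.1, 1.0, 0.9]),
}
grid, curves = common_grid_curves(subjects)
```

Widening the window keeps more information per subject but excludes more subjects, which is the trade-off described in the text.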
The resulting curves were used to estimate FoS models describing the effect of BMI, β(t), on VPD, DV, and NDV over time. Two FoS models are proposed for each dependent variable, using BMI measured at age 20 (Model 1) and at age 40 (Model 2). An analytical evaluation of the confidence intervals of the estimated β(t) was not feasible given the nature of the data. To overcome this issue, we applied the bootstrap technique [3] with 500 resamples.
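A bootstrap of this kind can be sketched by resampling subjects with replacement, re-estimating the coefficient function on each resample, and reading pointwise percentiles across the resamples. The text does not state which type of bootstrap interval was used, so the percentile interval below is an assumption, as are the pointwise least-squares estimator used in place of the fda fit and every function name.

```python
import random


def slopes(x, curves):
    """Pointwise least-squares slope of y(t_j) on x; None if x has no spread."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        return None
    out = []
    for j in range(len(curves[0])):
        y_mean = sum(c[j] for c in curves) / n
        sxy = sum((xi - x_mean) * (c[j] - y_mean) for xi, c in zip(x, curves))
        out.append(sxy / sxx)
    return out


def bootstrap_bands(x, curves, n_boot=500, level=0.95, seed=0):
    """Pointwise percentile bands for the slope function beta(t)."""
    if slopes(x, curves) is None:
        raise ValueError("covariate has no variation")
    rng = random.Random(seed)
    n = len(x)
    draws = []
    while len(draws) < n_boot:
        # Resample whole subjects so each curve stays paired with its covariate.
        idx = [rng.randrange(n) for _ in range(n)]
        est = slopes([x[i] for i in idx], [curves[i] for i in idx])
        if est is not None:  # redraw the rare resample with identical x values
            draws.append(est)
    alpha = (1 - level) / 2
    lower, upper = [], []
    for j in range(len(curves[0])):
        vals = sorted(d[j] for d in draws)
        lower.append(vals[int(alpha * (n_boot - 1))])
        upper.append(vals[int((1 - alpha) * (n_boot - 1))])
    return lower, upper
```

Resampling subjects rather than individual observations preserves the dependence between measurements taken on the same woman.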